(57) Abstract

Methods are described for the isolation and characterization of DNA sequences from Aspergillus niger var. awamori 
Title: Process for producing/secreting a protein by a transformed mould using
expression/secretion regulating regions derived from an Aspergillus 
endoxylanase II gene 

Background of the invention 

The invention relates to a process for the production and optionally secretion of a 
protein by means of a transformed mould, into which an expression vector has been 
introduced with the aid of recombinant DNA techniques known per se, said vector 
comprising one or more mould-derived expression and/or secretion regulating regions 
controlling the expression of a gene encoding said protein and optionally controlling 
the secretion of the protein so produced. Such a process is known from various 
publications, in which the production of proteins with the aid of transformed moulds 
is described. Thus, in the non-prior-published patent application PCT/EP 91/01135 
(UNILEVER, in the priority year published on 26 December 1991 as WO 91/19782) 
there is described, inter alia, the production of a homologous endoxylanase II protein 
by a transformed Aspergillus strain. 

Other ways of producing proteins by transformed moulds, in particular while using 
promoters originating from Aspergillus moulds, are known. 

Ward, M. et al. (GENENCOR; 1990) have described the production by a transformed
Aspergillus niger var. awamori of the milk-clotting enzyme chymosin or its
precursor prochymosin. It was concluded that production of a fusion protein in which
the prochymosin was connected with its N-terminus to the C-terminus of the
Aspergillus protein glucoamylase gave a much higher secretion than production of
the prochymosin alone, whereby in both cases the protein was preceded by the
glucoamylase signal sequence and was under control of the glucoamylase promoter.

- In CA-A-202444cS (ALLELIX BIOPHARMACE) "Recombinant DNA expression
construct - containing promoter for use in Aspergillus", published on 1 March 1991,
the constitutive promoter of the Aspergillus nidulans aldehyde dehydrogenase gene
and its use for the production of heterologous proteins in a transformed mould is
described,

- In EP-A-0436858 (GREEN CROSS CORP.) "Promoter of glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene - derived from Aspergillus oryzae, used in new
expression system in yellow-green or black koji mould", published on 17 July 1991, the
use of the promoter and terminator of the GAPDH gene in a vector for transforming
a mould to produce foreign proteins is described,
and

- In EP-A-0439997 (CIBA GEIGY AG) "A. niger pyruvate kinase promoter - used
to construct vectors for expression of structural genes in suitable hosts", published on
7 August 1991, the overproduction of a homologous gene product or a heterologous
gene product in A. niger is described.

Moulds are organisms frequently used in the production of proteins and metabolites.
A biotechnologically very important aspect of moulds is that they are capable of very
efficient protein production and, if desired, secretion into the medium. It is also
possible to grow moulds in a properly controlled way in large bioreactors. The
combination of the possibilities of generating fungal biomass cost-effectively by means
of fermentation and the high specific expression per cell make moulds exceptionally
interesting hosts for the production of both heterologous and homologous proteins.
For efficient production of these heterologous and homologous proteins, the use of an
efficient promoter effective in moulds is essential. For secretion of a protein into the
medium, specific sequences are required that cater for this. In connection with
possible toxicity for the mould cell of the protein to be produced, it is also important
that the activity of a promoter can be regulated, i.e. turned on at suitable moments;
thus an inducible promoter is preferred.



Summary of the invention 

The invention is based on the use of a non-prior-published promoter, which is
described in more detail below, as well as on the use of other expression and/or
secretion regulating regions, such as a terminator, a DNA sequence encoding a signal
sequence, and a DNA sequence encoding at least an essential part of a mature
endogenous mould protein.

In studies of the expression of proteins in moulds it was found that the enzyme
endoxylanase type II (exlA) was efficiently produced after induction of expression of the
exlA gene, and was also secreted efficiently into the medium. For production of that
protein, the encoding gene was cloned together with its own promoter. In comparison
with other mould promoters, the endoxylanase II promoter proved particularly
efficient. Expression of the gene encoding the endoxylanase II enzyme (regulated by
its own promoter) was found to be efficiently induced with various media
components, including wheat bran, xylan and xylose. This induction was found to
proceed efficiently in different mould strains (see WO 91/19782, UNILEVER).

This provided an opportunity to obtain an efficient inducible promoter as well as
other mould-derived expression and/or secretion regulating regions, including
transcription terminator signals and secretion signals, which might be used
for the production of heterologous and homologous proteins in moulds. The promoter
fragment, terminator fragment and secretion signals of the Aspergillus niger var.
awamori endoxylanase II gene were cloned and subsequently further defined.

The E. coli β-glucuronidase gene (uidA) was used as an example of the production of
a heterologous protein in a transformed mould. The promoter and terminator sequences
of the Aspergillus niger var. awamori endoxylanase II gene were used for the
construction of an expression vector. With the aid of this expression vector a
heterologous gene encoding the E. coli protein β-glucuronidase was expressed in
moulds under control of the endoxylanase II (exlA) promoter. By using exlA secretion
signals, the heterologous and homologous proteins can also be secreted.

As another example of the use of the exlA promoter and terminator for the
production of heterologous proteins, a gene encoding a Thermomyces lanuginosa
lipase was introduced in the expression vector under the control of exlA regulatory
sequences and used for the production and secretion of Thermomyces lanuginosa
lipase. NOVO-NORDISK, an enzyme manufacturing company in Denmark, is marketing
under the trade name "Lipolase" a lipase derived from Thermomyces
lanuginosa, but produced by another microorganism. To illustrate that the exlA signal
sequence can be used to direct the secretion of proteins other than exlA, a DNA
sequence encoding a Thermomyces lanuginosa mature lipase amino acid sequence was
fused to the exlA signal sequence and placed under the control of the exlA regulatory
sequences in the above-mentioned expression plasmid, and secretion of Thermomyces
lanuginosa lipase was demonstrated.

Of course, the heterologous genes, of which the expression is exemplified in this
specification, can be replaced by any DNA sequence encoding a desired protein
(coding for enzymes, proteins, etc.) originating from a wide range of organisms
(bacteria, yeasts, moulds, plants, animals and human beings) so that the desired
protein can be produced by moulds.

Thus in one embodiment of the invention a process is provided for the production in
transformed moulds of proteins other than endoxylanase type II using expression
regulating sequences derived from the Aspergillus niger var. awamori endoxylanase II
(exlA) gene, such as the promoter or the terminator, or functional derivatives of these
regulatory sequences. In another embodiment of this invention, a process is provided
by which proteins produced in moulds, if desired, can be secreted in the medium by
making use of the DNA sequence encoding the signal sequence, in particular the
pre-sequence or prepro-sequence, of the Aspergillus niger var. awamori endoxylanase II
gene or functional derivatives of these sequences. Finally the invention also provides
a process for producing a protein in which a vector is used comprising at least an
essential part of the DNA sequence encoding the mature endoxylanase II protein,
because it is known that in moulds an improved secretion of a heterologous protein
can be obtained by initially producing it as a fusion protein comprising part of an
endogenous mould protein (see also the Ward, M. et al. / GENENCOR reference
mentioned above).

Brief description of the Figures and Tables 

Fig. 1 shows the DNA sequence of the ca. 2.1 kb PstI-PstI fragment of Aspergillus niger
var. awamori present in the plasmid pAW14B, which fragment contains a gene coding
for an endoxylanase II, indicated as the exlA gene. The translation start and the stop
codon are doubly underlined. The 49 bp intron is underlined. The N-terminal end of
the mature protein is indicated. The amino acid sequence of the protein (both of the
pre(pro) form and of the mature protein) is indicated using the one-letter code.

Fig. 2 shows the restriction map of the genomic DNA region of Aspergillus niger var.
awamori comprising the exlA gene cloned in the phages lambda 1 and lambda 14.
The abbreviations used stand for: S: SalI; E: EcoRI; H: HindIII; P: PstI; P*: PstI;
B: BamHI; S#: SalI site originating from the polylinker of lambda-EMBL3; and
D: Sau3A. The solid bar indicates a 1.2 kb PstI*-BamHI fragment hybridizing with
Xyl06. The P* and PstI* symbols are used to distinguish the two PstI sites present.

Fig. 3 shows the plasmid pAW14B obtained by insertion of the 5.3 kb SalI fragment
comprising the exlA gene of Aspergillus niger var. awamori in the SalI site of pUC19.

Fig. 4 shows the plasmid pAW15-1 obtained by replacing the BspHI-AflII fragment
comprising the exlA open reading frame in pAW14B with an NcoI-AflII fragment
comprising the E. coli uidA coding sequence. Thus, plasmid pAW15-1 comprises the
E. coli uidA gene under the control of the A. niger var. awamori promoter and
terminator.
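
The fragment exchange described for Fig. 4 pairs a BspHI end in the vector with an NcoI end in the insert. A minimal Python sketch of why such ends can be joined is given below. It assumes the commonly cited recognition sites and cut positions (BspHI T^CATGA, NcoI C^CATGG, AflII C^TTAAG), which are not stated in this specification; the function names are illustrative only.

```python
# Recognition site and top-strand cut offset for each enzyme.
# These values are an assumption taken from standard enzyme tables,
# not from the specification itself.
ENZYMES = {
    "BspHI": ("TCATGA", 1),
    "NcoI": ("CCATGG", 1),
    "AflII": ("CTTAAG", 1),
}


def five_prime_overhang(enzyme: str) -> str:
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    site, cut = ENZYMES[enzyme]
    # A symmetric cut at `cut` on both strands leaves the middle of the site
    # single-stranded.
    return site[cut:len(site) - cut]


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """Ends are treated as ligatable when their palindromic overhangs match."""
    return five_prime_overhang(enzyme_a) == five_prime_overhang(enzyme_b)


if __name__ == "__main__":
    for name in ENZYMES:
        print(name, five_prime_overhang(name))
    print("BspHI/NcoI compatible:", compatible("BspHI", "NcoI"))
```

Under these assumptions both BspHI and NcoI leave a CATG overhang that contains the ATG start codon, which is consistent with an NcoI-AflII fragment taking the place of a BspHI-AflII fragment.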

Fig. 5 shows plasmid pAW15-7 obtained by inserting a 2.6 kb NotI fragment
comprising the E. coli hygromycin resistance gene controlled by the A. nidulans gpdA
promoter and the A. nidulans trpC terminator in the EcoRI site of pAW15-1.

Fig. 6 shows plasmid pAWTL1 obtained by replacing the BspHI-AflII fragment
comprising the exlA open reading frame in pAW14B with a BspHI-AflII fragment
comprising a nucleotide sequence encoding the T. lanuginosa lipase together with its
own pre-pro-sequence. Thus, plasmid pAWTL1 comprises the T. lanuginosa lipase
gene together with its own pre-pro-sequence encoding region under the control of the
A. niger var. awamori promoter and terminator.

Fig. 7 shows plasmid pAWTL2 obtained by replacing the NruI-AflII fragment
comprising the region encoding the mature exlA protein in pAW14B with a NruI-AflII
fragment comprising a nucleotide sequence encoding the mature part of the T.
lanuginosa lipase. Thus, plasmid pAWTL2 comprises the T. lanuginosa lipase gene
fused to the exlA pre-pro-sequence encoding region under the control of the A. niger
var. awamori promoter and terminator.

Fig. 8 shows plasmid pTL1 comprising a nucleotide sequence encoding the T.
lanuginosa lipase together with its own pre-pro-sequence under the control of the A.
niger gpdA promoter and the A. nidulans trpC terminator inserted in the polylinker of
pUC18. The region encoding the pre-pro-sequence of the T. lanuginosa lipase is
indicated by "ss".

Fig. 9 shows the sequence comprising the open reading frame encoding the T.
lanuginosa lipase as it is contained within plasmid pTL1. The N-terminal end of the
mature protein is indicated.

Table A shows various probes derived from the N-terminal amino acid sequence of
the endoxylanase II protein. These probes were used for the isolation of the exlA
gene, see item 1.1 of Example 1.

The number of oligonucleotides present in the "mixed" probe is indicated in brackets;
this number is obtained by including 1, 2, 3 or 4 different bases in every third
position, depending on the number of codons for an amino acid. In Xyl04 nucleotides
were selected on the basis of the hybridization G-C and G-T and/or on the basis of
the preferred codons in Aspergillus niger glucoamylase. In Xyl05 and Xyl06 not all of
the possibly occurring bases are introduced at the third position of the codons in
order not to obtain more than 256 oligonucleotides in the mixture. The sequence of
the oligonucleotides is complementary to that of the coding strand of the DNA, which
resembles the corresponding mRNA.

Xyl01:  a mixture of 256 oligos having a length of 23 deoxynucleotides, the
        sequence of which is complementary to the part of the coding strand
        coding for the amino acids 5-12.
Xyl04:  an oligo having a length of 47 deoxynucleotides, the sequence of which is
        complementary to the part of the coding strand coding for the amino acids
        2-17.
Xyl05:  a mixture of 144 oligos having a length of 23 deoxynucleotides, the
        sequence of which is complementary to the part of the coding strand
        coding for the amino acids 10-17.
Xyl06:  a mixture of 256 oligos having a length of 47 deoxynucleotides, the
        sequence of which is complementary to the part of the coding strand
        coding for the amino acids 2-17.
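
The mixture sizes quoted for the probes (256 or 144 oligos) follow from multiplying the number of alternative bases allowed at each position. The Python sketch below illustrates that arithmetic. The IUPAC ambiguity codes are an assumption of this example (Table A itself only uses X for any of A, G, C or T), and the example probe strings are hypothetical, not the actual probe sequences.

```python
from math import prod

# Number of bases represented by each symbol. IUPAC codes are assumed here;
# "X" is included because Table A uses X = A, G, C or T.
BASES_PER_SYMBOL = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4, "X": 4,
}


def mixture_size(probe: str) -> int:
    """Number of distinct oligonucleotides in a mixed (degenerate) probe."""
    return prod(BASES_PER_SYMBOL[symbol] for symbol in probe.upper())


if __name__ == "__main__":
    # Hypothetical 23-mer with six two-fold positions and one four-fold position.
    example = "TTRATRCANGTYTTRATRTTRCC"
    print(len(example), mixture_size(example))
```

Limiting the choices at some third positions, as described for Xyl05 and Xyl06, lowers individual factors in this product and so keeps the mixture at or below 256 oligonucleotides.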

Table B shows various single-stranded subclones of lambda 1 and lambda 14
fragments, which were used for determination of the sequence of the exlA gene, see
item 1.2 of Example 1.

Table C shows the results of E. coli β-glucuronidase production by non-transformed
and transformed strains of the mould Aspergillus niger var. awamori, see item 2.2 of
Example 2.

Table D shows that functional lipase was produced and secreted after induction of the
exlA promoter by xylose, and that the secretion of a heterologous (Thermomyces
lanuginosa) mature protein was directed in Aspergillus niger var. awamori by using
either the exlA signal sequence (see AWLPL2-2) or the Thermomyces lanuginosa
signal sequence (see AWLPL1-2), see item 3.2 of Example 3.

Table E shows various nucleotide sequences of oligonucleotides used in constructions
described in Examples 1-3, see items 1.4, 2.1, 3.1.1, 3.1.2, 3.1.3, and 3.1.4. The
sequence listing numbers refer to the listings provided in the official format.

Detailed description of the invention 

Since the endoxylanase II gene is expressed and the resulting protein is secreted very
efficiently under appropriate cultivation conditions by Aspergillus niger var. awamori,
the present invention is directed in particular to the cloning of the regulatory regions
of the Aspergillus niger var. awamori endoxylanase II (exlA) gene, such as the
promoter sequence, terminator sequence and signal sequence, and using these
components for the development of a process for the production of proteins in moulds.
The invention therefore relates generally to a process making use of a nucleic acid
sequence derivable from a mould and comprising at least a regulatory region
derivable from a gene encoding a polypeptide having endoxylanase II activity. Said
nucleic acid sequence can be combined with nucleic acid sequences encoding other
homologous or heterologous genes to bring these genes under the control of at least
one exlA regulatory sequence.

"Nucleic acid sequence" as used herein refers to a polymeric form of nucleotides of
any length, both to ribonucleic acid (RNA) sequences and deoxyribonucleic acid
(DNA) sequences. In principle this term refers to the primary structure of the
molecule. Thus this term includes both single and double stranded DNA, as well as
single stranded RNA and modifications thereof.

In general the term "protein" refers to a molecular chain of amino acids with a
biological activity; it does not refer to a specific length of the product, and the
protein can, if required, be modified in vivo or in vitro. This modification can for
example take the form of amidation, carboxylation, glycosylation, or phosphorylation;
thus inter alia peptides, oligopeptides and polypeptides are included. In this
specification both terms, polypeptide and protein, are used as synonyms unless a
specific meaning is clear from the context.

The invention also relates to the use of a vector containing the nucleic acid sequences
as described for the production of proteins other than Aspergillus niger var. awamori
endoxylanase II (exlA) and also relates to the use of micro-organisms containing said
vectors or nucleic acid sequences for producing said proteins.

The invention is also directed at the use of modified sequences of the aforementioned
nucleic acid sequences according to the invention for the production of proteins other
than Aspergillus niger var. awamori endoxylanase II (exlA), said modified sequences
also having regulatory activity. The term "a modified sequence" covers nucleic acid
sequences having the regulatory activity equivalent to or better than the nucleic acid
sequence derivable from a mould and comprising at least a regulatory region
derivable from a gene encoding a protein having endoxylanase II activity. Such an
equivalent nucleic acid sequence can have undergone substitution, deletion or
insertion, or a combination of the aforementioned, of one or more nucleotides
resulting in a modified nucleic acid sequence without concomitant loss of regulatory
activity occurring. Processes for the production of proteins other than Aspergillus niger
var. awamori endoxylanase II (exlA) using such modified nucleic acid sequences fall
within the scope of the present invention. In particular processes for the production of
proteins other than Aspergillus niger var. awamori endoxylanase II (exlA) using
modified sequences capable of hybridizing with the non-modified nucleic acid
sequence and still maintaining at least the regulatory activity of the non-modified
nucleic acid sequence fall within the scope of the invention.

The expression "functional derivatives" used in the claims refers to such modified
sequences.

The term "a part of" covers a nucleic acid sequence being a subsequence of the
nucleic acid sequence derivable from a mould and comprising at least a regulatory
region derivable from a gene encoding a polypeptide having endoxylanase II activity.
In particular the invention is directed at a process using a nucleic acid sequence
derivable from a mould of the genus Aspergillus. A suitable example of a mould from
which a nucleic acid sequence according to the invention can be derived is an
Aspergillus of the species Aspergillus niger, in particular Aspergillus niger var. awamori.
In particular the strain Aspergillus niger var. awamori CBS 115.52 (ATCC 11358) is
eminently suitable for deriving a nucleic acid sequence according to the invention.
Preferably the nucleic acid sequence for use in a process according to the invention
comprises at least a promoter as regulatory region. The nucleic acid sequence for use
in a process according to the invention can also comprise an inducer or enhancer
sequence enabling a higher level of expression of any nucleic acid sequence operably
linked to the promoter. It is also possible for the nucleic acid sequence for use in a
process according to the invention to comprise a termination signal as regulatory
region. The nucleic acid sequence for use in a process according to the invention can
comprise one or more regulatory regions. A nucleic acid sequence for use in a
process according to the invention can comprise solely the promoter as regulatory
region or a combination thereof with an enhancer or other functional elements. A
nucleic acid sequence for use in a process according to the invention can also further
comprise terminator sequences, although these are not always required for efficient
expression of the desired expression product.

According to a further embodiment of the invention a nucleic acid sequence for use
in a process according to the invention can further comprise a sequence encoding a
secretory signal necessary for secreting a gene product from a mould. This will be
preferred when intracellular production of a desired expression product is not
sufficient and extracellular production of the desired expression product is required.
Secretory signals comprise the prepro- or pre-sequence of the endoxylanase II gene
for example. A secretory signal derivable from the endoxylanase II gene of an
Aspergillus mould is particularly favoured. The specific embodiment of the nucleic
acid sequence used in a process according to the invention will however depend on
the goal that is to be achieved upon using a process according to the invention.

"Signal sequence" as used herein generally refers to a sequence of amino acids which
is responsible for initiating export of a protein or polypeptide chain. A signal
sequence, once having initiated export of a growing protein or polypeptide chain, can
be cleaved from the mature protein at a specific site. The term also includes leader
sequences or leader peptides. The preferred signal sequence herein is the deduced
signal sequence from the Aspergillus niger var. awamori endoxylanase II gene given in
Fig. 1.

With the help of DNA oligonucleotides deduced from protein sequence analysis of
endoxylanase II from Aspergillus niger var. awamori, chromosomal DNA fragments
comprising the entire endoxylanase II (exlA) gene of Aspergillus niger var. awamori,
including the regulatory regions such as the promoter, the signal sequence and the
termination sequence, have been isolated from a genomic library. The regulatory
regions of the endoxylanase II (exlA) gene have been used for the production and, if
desired, secretion of proteins other than endoxylanase II, e.g. heterologous proteins,
by Aspergillus niger var. awamori. The invention is therefore in particular directed at a
process in which one or more of the regulatory regions of the Aspergillus niger var.
awamori endoxylanase II gene or equivalent nucleic acid sequences are used for the
production of proteins other than endoxylanase II in Aspergillus niger var. awamori.
The term "equivalent nucleic acid sequence" has the same meaning as given above for
"a modified nucleic acid sequence".

In the Examples given below the expression and secretion potential of the obtained
exlA promoter and the exlA signal sequences have been tested by constructing new
vectors for expression of a heterologous β-glucuronidase gene and the production and
secretion of a heterologous lipase in Aspergillus. The resulting constructs were tested
in Aspergillus niger var. awamori.

Thus in a general form the invention provides a process for the production and
optionally secretion of a protein different from the endoxylanase type II protein ex
Aspergillus niger var. awamori by means of a transformed mould, into which an
expression vector has been introduced with the aid of recombinant DNA techniques
known per se, said vector comprising mould-derived expression and/or secretion
regulating regions, in which process at least one of said expression and/or secretion
regulating regions is selected from (1) the expression and secretion regulating regions
of the endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on
plasmid pAW14B (Figure 3), which is present in a transformed E. coli strain JM109
deposited at the Centraalbureau voor Schimmelcultures in Baarn, The Netherlands,
under N° CBS 237.90 on 31 May 1990, and (2) functional derivatives thereof also
having expression and/or secretion regulating activity.

In a preferred embodiment of the invention the selected expression regulating region
is a promoter and said vector comprises a gene encoding said protein under control
of said promoter, the latter being selected from (1) the promoter of the endoxylanase
II gene (exlA gene) of Aspergillus niger var. awamori present on plasmid pAW14B
(Figure 3), which is present in a transformed E. coli strain JM109 deposited at the
Centraalbureau voor Schimmelcultures in Baarn, The Netherlands, under N° CBS
237.90 on 31 May 1990, and (2) functional derivatives thereof also having promoter
activity. More preferably said promoter is equal to the promoter present on the 5'
part upstream of the exlA gene having a size of about 2.5 kb located between the SalI
restriction site at position 0 and the start codon ATG of the exlA gene in plasmid
pAW14B; in particular said promoter comprises at least the polynucleotide sequence
1-350 according to Figure 1.

This promoter can be induced by wheat bran, xylan, or xylose, or a mixture of any 
combination thereof, present in a medium in which the transformed mould is 
incubated, whereby the use of xylose as inducing agent is preferred. 

In another preferred embodiment of the invention the selected expression regulating
region is a terminator and said vector comprises a gene encoding said protein
followed by said terminator, the latter being selected from (1) the terminator of the
endoxylanase II gene (exlA gene) of Aspergillus niger var. awamori present on plasmid
pAW14B (Figure 3), which is present in a transformed E. coli strain JM109 deposited
at the Centraalbureau voor Schimmelcultures in Baarn, The Netherlands, under N°
CBS 237.90 on 31 May 1990, and (2) functional derivatives thereof also having
terminator activity. Preferably, said terminator is equal to the terminator present on
the 3' part downstream of the exlA gene having a size of about 1.0 kb located right
downstream of the stop codon (TAA) of the exlA gene in plasmid pAW14B.

A further embodiment of the invention is a process for the production and secretion
of a protein different from the endoxylanase type II protein ex Aspergillus niger var.
awamori by means of a transformed mould, in which process the selected secretion
regulating region is a DNA sequence encoding a signal sequence and said vector
comprises a gene encoding said protein preceded by said DNA sequence encoding a
signal sequence, the latter being selected from (1) the DNA sequence encoding the
signal sequence of the endoxylanase II gene (exlA gene) of Aspergillus niger var.
awamori present on plasmid pAW14B (Figure 3), which is present in a transformed E.
coli strain JM109 deposited at the Centraalbureau voor Schimmelcultures in Baarn,
The Netherlands, under N° CBS 237.90 on 31 May 1990, and (2) functional
derivatives thereof also directing secretion of the protein. Preferably, the gene (1) encoding
said protein is also preceded by at least an essential part of a DNA sequence (2)
encoding the mature endoxylanase II protein, whereby said DNA sequence (2) is
present between said DNA sequence encoding a signal sequence (3) and the gene (1).
A preferred signal sequence is the signal sequence encoded by polynucleotide 351-431
of the DNA sequence given in Figure 1, which polynucleotide precedes the DNA
sequence in plasmid pAW14B encoding the mature exlA polypeptide.
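
The promoter and signal-sequence regions are given above as 1-based, inclusive positions in the Figure 1 sequence (1-350 and 351-431). The Python sketch below shows how such coordinates map onto string slices and what lengths they imply. The Figure 1 sequence is not reproduced in this section, so a placeholder string stands in for it, and the function names are illustrative.

```python
# 1-based inclusive coordinates quoted in the text for the Figure 1 sequence.
PROMOTER = (1, 350)
SIGNAL_SEQUENCE = (351, 431)


def span_length(coords: tuple[int, int]) -> int:
    """Length in nucleotides of a 1-based inclusive interval."""
    start, end = coords
    return end - start + 1


def extract(sequence: str, coords: tuple[int, int]) -> str:
    """Slice a sequence using 1-based inclusive coordinates."""
    start, end = coords
    return sequence[start - 1:end]


if __name__ == "__main__":
    # Placeholder only: the real Figure 1 sequence (ca. 2.1 kb) is not given here.
    figure1_placeholder = "N" * 2100
    signal = extract(figure1_placeholder, SIGNAL_SEQUENCE)
    print("promoter nt:", span_length(PROMOTER))
    print("signal nt:", len(signal), "codons:", len(signal) // 3)
```

The interval 351-431 spans 81 nucleotides, which corresponds to 27 codons if that stretch is uninterrupted coding sequence.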

Summarizing, in a process for producing a protein according to the invention the
vector used for transforming a mould can comprise an exlA-derived promoter as
hereinbefore described or an exlA-derived terminator as hereinbefore described or an
exlA-derived signal sequence as hereinbefore described or at least an essential part of
the exlA structural gene, or any combination of these expression and/or secretion
regulating regions.

The invention is illustrated with the following Examples without being limited thereto.
All techniques used for the manipulation and analysis of nucleic acid materials were
performed essentially as described in Sambrook et al. (1989), except where indicated
otherwise.

Example 1   Cloning and characterization of the endoxylanase II gene (exlA)
            and associated regulating sequences of Aspergillus niger var.
            awamori

1.1 Isolation of the Aspergillus niger var. awamori exlA gene
In order to isolate the exlA gene from chromosomal DNA of Aspergillus niger var.
awamori different probes were synthesized consisting of mixtures of oligonucleotides
(Table A). The composition of these mixtures was derived from the N-terminal amino
acid sequence of purified endoxylanase II protein.

By means of Southern blot analysis it was established that in digests of chromosomal
DNA, under stringent conditions, only one band hybridizes with the probes used. In
the EcoRI, SalI and BamHI digests of Aspergillus niger var. awamori DNA one band of
respectively 4.4, 5.3 and 9.5 kb hybridizes with each of Xyl01, Xyl04 and Xyl06. With
Xyl05 no clear signal was found at 41°C. On the basis of this result a gene bank of
Aspergillus niger var. awamori DNA was hybridized at 65°C with the oligonucleotide
mixture Xyl06 as a probe. Of the 65000 tested plaques (corresponding to 32 times the
genome) three plaques (lambda 1, 14 and 63) hybridized with this probe. After
hybridization of digests of lambda 1 and lambda 14 DNA with Xyl06 a hybridizing
band of >10 kb was found in the EcoRI digest of lambda 1. The size of the
hybridizing band in the lambda 14 and the chromosomal EcoRI digest was 4.4 kb. In
the SalI digest of lambda 1 a 4.6 kb band hybridizes; in the SalI digest of lambda 14
this is, as in chromosomal DNA, a 5.3 kb band. Also a 1.2 kb PstI-BamHI fragment
(Fig. 2) hybridizes with Xyl06. On the basis of restriction patterns with different
enzymes and cross-hybridization of lambda 1 and lambda 14 digests with the 5.3 kb
SalI fragment of lambda 14 it was confirmed that these lambdas contained
overlapping fragments of the genome of Aspergillus niger var. awamori. Also
homologous hybridization of total induced RNA with respectively lambda 1, lambda
14 and the 5.3 kb SalI fragment of lambda 14 confirmed the presence of exlA
sequences on these lambdas. Hybridization was found with a xylan-induced mRNA of
ca. 1 kb. The size thereof corresponds to that of the mRNA molecule hybridizing with
Xyl06.

Probe   Length (nt)   Amino acids   Oligos in mixture
Xyl01   23            5-12          256
Xyl04   47            2-17          1
Xyl05   23            10-17         144
Xyl06   47            2-17          256

[Amino acid sequence and probe base sequences (3' to 5'; X = A, G, C or T) not legible in the source]

Table A  Probes derived from the N-terminal amino acid sequence of the
         endoxylanase II protein

1.2 Subcloning of the Aspergillus niger var. awamori exlA gene

The SalI fragments hybridizing with Xyl06 of respectively lambda 1 (4.6 kb) and lambda 14 (5.3 kb) were cloned in two orientations in the SalI site of pUC19, which resulted in respectively plasmid pAW1 (A and B) and plasmid pAW14 (A and B, see Fig. 3). The 1.2 kb PstI*-BamHI fragment hybridizing with Xyl06 and the adjacent 1.0 kb BamHI-PstI fragment from respectively pAW14A and pAW1A were subcloned into M13mp18 and M13mp19 cut with BamHI and PstI, resulting in the m18/m19 AW vectors of Table B.

Fragment                          Resulting vectors
pAW1A   BamHI-PstI* (1.2 kb)      m18AW1A-1 / m19AW1A-1
pAW14A  BamHI-PstI* (1.2 kb)      m18AW14A-1 / m19AW14A-1
pAW1A   PstI-BamHI (1.0 kb)       m18AW1A-2 / m19AW1A-2
pAW14A  PstI-BamHI (1.0 kb)       m18AW14A-2 / m19AW14A-2

Table B. Single-stranded clones of lambda 1 and lambda 14 fragments

1.3 Determination of the transcription direction of the exlA gene

The transcription direction of the exlA gene was established by means of spot blot hybridization of ss-DNA of respectively m18AW14A-1 and m19AW14A-1 with Xyl06. It was found that ss-DNA of m19AW14A-1 (5'-PstI*-BamHI-3') hybridizes with this probe. Because the sequence of Xyl06 is equal to that of the non-coding strand, m19AW14A-1 contains the coding strand. On the basis thereof the transcription direction shown in Fig. 2 was determined. This direction is confirmed by the results of a primer extension experiment.

1.4 Identification of the exlA gene

The DNA sequence of a part of the promoter region was determined by sequence analysis of pAW14B with Xyl06 as a primer (5' part of the gene). In this region a primer Xyl11 (see Table E) was selected, with which the DNA sequence of the complementary strand of m18AW14A-1 and m18AW1A-1 was determined. The results showed that these vectors contained a DNA sequence which was substantially equal to that of Xyl06, while the amino acid sequence derived from the base pair sequence was identical with the N-terminal amino acid sequence of the mature endoxylanase II protein. Thus the cloning of at least the 5' end of the exlA gene had been proven. The presence of the entire exlA gene in the vectors pAW14 and pAW1 seemed plausible on the basis of the position of the 5' end of the gene on the SalI fragments (Fig. 2) and the size of the exlA mRNA (ca. 1 kb).

1.5 Sequence analysis

The nucleotide sequence of the exlA gene and surrounding regions was established in two directions in both the m13AW14 and the m13AW1 subclones by means of the dideoxy procedure (Sanger et al., 1977). The sequence around the BamHI site located downstream of the PstI* site (Fig. 2) was established by sequence analysis of double-stranded pAW14 and pAW1 DNA. Compressions were cleared up by using dITP instead of dGTP. In the independent clones lambda 1 and lambda 14 an identical exlA sequence was established. The complete nucleotide sequence of the 2.1 kb PstI*-PstI fragment comprising the entire pre(pro) endoxylanase II gene and the promoter and terminator sequences of the endoxylanase II gene is shown in Fig. 1. The mature endoxylanase II protein is preceded by a leader peptide of 27 amino acids. A predicted recognition site for the signal peptidase is present between the alanine residues at the positions 16 and 17 (..-T-A-F-A-↓-A-P-V-..) (von Heijne, 1986). From the length of the leader peptide it can be derived that in the protein a second processing site is present. Cleavage of the bond between Arg (27) and Ser (28) presumably is performed by a KEX2-like endoprotease (Fuller et al., 1988).

1.6 Localization of the intron 

In the exlA gene the presence of an intron of either 49 or 76 bp (581-629 or 581-656, 
see Fig. 1) was predicted on the basis of the presence of sequences corresponding to 
"donor" and "acceptor" sites of introns in aspergilli. Definite proof of the absence of a 
76 bp intron was obtained by isolation of an endoxylanase II derived peptide with the 
sequence Tyr-Ser-Ala-Ser-Gly... This peptide can only be localized in the protein 
starting from nucleotide position 652 (see Fig. 1). Therefore, the exlA gene comprises 
a single, 49 bp intron (position 581-629, see Fig. 1). 
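
The exclusion argument above can be written as a small interval check. This is only an illustrative sketch: the positions are those quoted in the text (Fig. 1 numbering), while the names `CANDIDATE_INTRONS`, `PEPTIDE_START` and `is_compatible` are hypothetical and introduced here for the example.

```python
# Candidate introns predicted from donor/acceptor consensus sequences
# (inclusive, 1-based positions as quoted in the text for Fig. 1).
CANDIDATE_INTRONS = {"49 bp": (581, 629), "76 bp": (581, 656)}

# First nucleotide of the region encoding the isolated peptide
# Tyr-Ser-Ala-Ser-Gly... (position quoted in the text).
PEPTIDE_START = 652


def intron_length(start: int, end: int) -> int:
    """Length of an inclusive, 1-based interval."""
    return end - start + 1


def is_compatible(intron: tuple, coding_position: int) -> bool:
    """A candidate intron is only possible if the coding position lies outside it."""
    start, end = intron
    return not (start <= coding_position <= end)


# Keep only the candidates that do not swallow the peptide-coding position.
surviving = [
    name
    for name, span in CANDIDATE_INTRONS.items()
    if is_compatible(span, PEPTIDE_START)
]

for name, span in CANDIDATE_INTRONS.items():
    print(name, intron_length(*span), is_compatible(span, PEPTIDE_START))
print("surviving candidate:", surviving)
```

Because position 652 falls inside 581-656 but outside 581-629, only the 49 bp candidate remains, which matches the conclusion drawn in the text.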

1.7 Determination of the 3' end of the exlA gene 

The position of the stop codon of the exlA gene (position 1033-1035 in Fig. 1) was 
derived from DNA sequence data. This stop codon was confirmed, since the amino 
acid sequence of one of the peptides derived from endoxylanase II by chemical 
cleavage with CNBr proved to be identical to the C-terminal amino acid sequence 
derived from DNA sequence data (position 991-1032 in Fig. 1). 

1.8 Evaluation of DNA and protein data 

On the basis of the above data it was established that the gene coding for endoxylanase II of Aspergillus niger var. awamori had been cloned on a 5.3 kb SalI fragment. The DNA sequence of the gene, the position of the intron and the length of the mRNA were established. The established N-terminal amino acid sequence of the mature protein was fully confirmed by the DNA sequence. On the basis of the above data it can be concluded that the exlA gene codes for a protein of 211 amino acids and that the first 27 amino acids are removed post-translationally. From these data the exlA signal sequence was derived.

Also, the nucleotide sequence of the exlA promoter follows from the obtained sequence (see Fig. 1, position 1-350), as does the nucleotide sequence of the exlA terminator (see Fig. 1, position 1036-2059 or a part thereof).
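
As a consistency check on the figures above, the lengths can be recomputed from the feature coordinates given in the sequence listing for SEQ ID NO:1. The sketch below is illustrative only; the dictionary `FEATURES` and the helper `span_length` are hypothetical names, and the coordinates are copied from the listing (intron 581..629, promoter 1..350, sig_peptide 351..431, mat_peptide 432..580 joined to 630..1032, CDS 351..580 joined to 630..1035).

```python
# Feature coordinates (inclusive, 1-based) as given in the sequence listing.
FEATURES = {
    "promoter": [(1, 350)],
    "sig_peptide": [(351, 431)],
    "mat_peptide": [(432, 580), (630, 1032)],
    "CDS": [(351, 580), (630, 1035)],
    "intron": [(581, 629)],
}


def span_length(segments):
    """Total number of nucleotides covered by a list of inclusive segments."""
    return sum(end - start + 1 for start, end in segments)


cds_nt = span_length(FEATURES["CDS"])                   # coding nucleotides incl. stop codon
protein_aa = cds_nt // 3 - 1                            # codons minus the stop codon
leader_aa = span_length(FEATURES["sig_peptide"]) // 3   # pre-pro leader
mature_aa = span_length(FEATURES["mat_peptide"]) // 3   # mature endoxylanase II
intron_bp = span_length(FEATURES["intron"])

print("CDS nucleotides:", cds_nt)
print("primary translation product (aa):", protein_aa)
print("leader (aa):", leader_aa)
print("mature protein (aa):", mature_aa)
print("intron (bp):", intron_bp)
```

The coordinates are internally consistent: the coding sequence is a multiple of three, and the leader and mature lengths add up to the length of the primary translation product stated in the text.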

Example 2 Expression of the Escherichia coli β-glucuronidase (uidA) gene using the exlA promoter and terminator sequences.

2.1 Construction of the uidA expression vector

The uidA expression plasmid (pAW15-1) was constructed starting from plasmid pAW14B, which contains a ca. 5.3 kb SalI fragment on which the 0.7 kb endoxylanase II (exlA) gene is located, together with 2.5 kb of 5'-flanking sequences and 2.0 kb of 3'-flanking sequences (Fig. 3). In pAW14B the exlA coding region was replaced by the uidA coding region. A BspHI site (5'-TCATGA-3') comprising the first codon (ATG) of the exlA gene and an AflII site (5'-CTTAAG-3') comprising the stopcodon (TAA) of the exlA gene facilitated the construction of pAW15-1.

The construction was carried out as follows: pAW14B (7.9 kb) was cut partially with BspHI (pAW14B contains five BspHI sites) and the linearized plasmid (7.9 kb) was isolated from an agarose gel. Subsequently the isolated 7.9 kb fragment was cut with BsmI, which cuts a few nucleotides downstream of the BspHI site of interest, to remove plasmids linearized at the other BspHI sites. The fragments were separated on an agarose gel and the 7.9 kb BspHI-BsmI fragment was isolated. This was partially cut with AflII and the resulting 7.2 kb BspHI-AflII fragment was isolated.

The uidA gene was isolated as a 1.9 kb NcoI-AflII fragment from pNOM-AflII, a plasmid derived from pNOM102 (Roberts et al., 1989). In pNOM102 two NcoI sites are present, one of which is located at the 5'-end of the uidA gene and comprises the ATG startcodon for translation of the gene. The second NcoI site is located a few nucleotides downstream of the stopcodon. To obtain an AflII site downstream of the uidA stopcodon the latter NcoI site was converted into an AflII site: pNOM102 was cut partially with NcoI and ligated with a NcoI-AflII linker (Nco-Afl, see Table E), resulting in vector pNOM-AflII. The 7.2 kb BspHI-AflII fragment of pAW14B was ligated to the 1.9 kb NcoI-AflII fragment of pNOM-AflII to give vector pAW15-1 (Fig. 4).
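
The ligation of a BspHI end to an NcoI end works because both enzymes leave the same 5'-CATG overhang. The following sketch illustrates this; it is not part of the original procedure. The recognition sequences for BspHI and AflII are those quoted in the text, the NcoI site (CCATGG) and the cut positions (one base into each site) are standard enzyme data, and the names `ENZYMES`, `overhang` and `hybrid_site` are hypothetical.

```python
# Recognition sequence and number of top-strand bases before the cut.
ENZYMES = {
    "BspHI": ("TCATGA", 1),  # T^CATGA
    "NcoI": ("CCATGG", 1),   # C^CATGG
    "AflII": ("CTTAAG", 1),  # C^TTAAG
}


def overhang(enzyme: str) -> str:
    """5' single-stranded overhang left by a palindromic 5'-overhang cutter."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(a: str, b: str) -> bool:
    """Two ends can be ligated directly when their overhangs are identical."""
    return overhang(a) == overhang(b)


def hybrid_site(left: str, right: str) -> str:
    """Sequence formed when the left part of one cut site joins the right part of another."""
    left_site, left_cut = ENZYMES[left]
    right_site, right_cut = ENZYMES[right]
    return left_site[:left_cut] + right_site[right_cut:]


# exlA promoter end (BspHI) joined to the uidA start (NcoI).
junction = hybrid_site("BspHI", "NcoI")

print("BspHI/NcoI compatible:", compatible("BspHI", "NcoI"))
print("junction:", junction, "| ATG retained:", "ATG" in junction)
print("stop codon inside AflII site:", "TAA" in ENZYMES["AflII"][0])
```

The hybrid junction keeps the ATG in place, so the uidA coding region starts at exactly the position of the exlA start codon, although the junction itself is no longer a BspHI or NcoI site.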

The constructed vector (pAW15-1) can subsequently be transferred to moulds (for example Aspergillus niger, Aspergillus niger var. awamori, Aspergillus nidulans etc.) by means of conventional co-transformation techniques and the β-glucuronidase can then be expressed via induction of the endoxylanase II promoter. The constructed vector can also be provided with conventional selection markers (e.g. amdS or pyrG, hygromycin etc.) and the mould can be transformed with the resulting vector to produce the desired protein. As an example, the E. coli hygromycin selection marker was introduced in the uidA expression vector, yielding pAW15-7 (Fig. 4). For this purpose a fragment containing the E. coli hygromycin resistance gene controlled by the Aspergillus nidulans gpdA promoter and the Aspergillus nidulans trpC terminator was used. This cassette was isolated as a 2.6 kb NotI fragment from pBluekan7-1 in which the hygromycin resistance cassette is flanked by NotI sites. In pAW15-1 a NotI site was created by converting the EcoRI site (present 1.2 kb upstream of the ATG codon) into a NotI site using a synthetic oligonucleotide (Eco-Not, see Table E), yielding pAW15-1-Not. The 2.6 kb NotI fragment from pAWBluekan7-1 was isolated and ligated with NotI-linearized pAW15-1-Not. The resulting vector was called pAW15-7 (Fig. 5).

2.2 Production of E. coli β-glucuronidase driven by exlA expression signals

pAW15-7 was used to transform Aspergillus niger var. awamori. Transformant AW15.7-1 was identified by hygromycin selection, and Southern hybridization analysis of genomic DNA of this transformant established that it contains a single copy of the uidA gene.

Aspergillus niger var. awamori (AW) and transformant AW15.7-1 were grown under the following conditions: shake flasks (500 ml) with 200 ml synthetic medium (pH 6.5) were inoculated with spores (final concentration: 10E6/ml).
The medium had the following composition (AW Medium):



sucrose            10      g/l
NaNO3               6.0    g/l
KCl                 0.52   g/l
KH2PO4              1.52   g/l
MgSO4·7H2O          0.49   g/l
Yeast extract       1.0    g/l
ZnSO4·7H2O         22      mg/l
H3BO3              11      mg/l
MnCl2·4H2O          5      mg/l
FeSO4·7H2O          5      mg/l
CaCl2·6H2O          1.7    mg/l
CuSO4·5H2O          1.6    mg/l
NaH2MoO4·2H2O       1.5    mg/l
Na2EDTA            50      mg/l

Incubation took place at 30°C, 200 rpm for 24 hours in a Mk X incubator shaker. After growth cells were removed by filtration (0.45 μm filter), washed twice with AW Medium without sucrose and yeast extract (salt solution), resuspended in 50 ml salt solution and transferred to 300 ml shake flasks containing 50 ml salt solution to which xylose has been added to a final concentration of 10 g/l (induction medium). The moment of resuspension is referred to as "t=0" (start of induction). Incubation took place under the same conditions as described above. Samples were taken 15 and [...] hours after induction. Biomass was recovered by filtration over miracloth, dried by squeezing and immediately frozen in liquid nitrogen. The mycelium was disrupted by grinding the frozen mycelium and β-glucuronidase activity was determined essentially as described in Roberts et al. (1989).
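
For bench use, the AW Medium recipe scales linearly with volume. The sketch below, which is only an illustration, converts the listed concentrations into weighed amounts for the 200 ml growth flasks and for the salt solution used after washing. The names `AW_MEDIUM_MG_PER_L` and `amounts_mg` are hypothetical, and the 50 ml induction volume is an assumption taken from the flask volume mentioned in the text.

```python
# AW Medium composition in mg per litre, as listed in the text.
AW_MEDIUM_MG_PER_L = {
    "sucrose": 10_000,
    "NaNO3": 6_000,
    "KCl": 520,
    "KH2PO4": 1_520,
    "MgSO4.7H2O": 490,
    "yeast extract": 1_000,
    "ZnSO4.7H2O": 22,
    "H3BO3": 11,
    "MnCl2.4H2O": 5,
    "FeSO4.7H2O": 5,
    "CaCl2.6H2O": 1.7,
    "CuSO4.5H2O": 1.6,
    "NaH2MoO4.2H2O": 1.5,
    "Na2EDTA": 50,
}


def amounts_mg(volume_ml: float, omit=()) -> dict:
    """Milligrams of each component needed for the given volume."""
    return {
        name: mg_per_l * volume_ml / 1000
        for name, mg_per_l in AW_MEDIUM_MG_PER_L.items()
        if name not in omit
    }


# Growth medium for one 200 ml shake-flask culture.
growth_flask = amounts_mg(200)

# Salt solution = AW Medium without sucrose and yeast extract (assumed 50 ml).
salt_solution = amounts_mg(50, omit=("sucrose", "yeast extract"))

# Xylose added to 10 g/l in the induction medium (assumed 50 ml).
xylose_mg = 10_000 * 50 / 1000

for name, mg in growth_flask.items():
    print(f"{name:15s} {mg:10.2f} mg per 200 ml")
print("xylose for induction:", xylose_mg, "mg per 50 ml")
```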

From Table C it is evident that the exlA promoter is specifically induced by the presence of xylose, and that the exlA promoter and terminator can be used for the production of E. coli β-glucuronidase in transformant AW15.7-1.

Table C. β-Glucuronidase production

Transformants were grown on synthetic medium as indicated in the text for 24 hours and at t=0 were transferred to induction medium as indicated in the text. β-Glucuronidase activity in the mycelium was determined as described in the text and is expressed in arbitrary units of enzymatic activity per milligram total protein.

Example 3 Production and secretion of the Thermomyces lanuginosa lipase using the exlA promoter, signal sequence and terminator.

3.1 Construction of expression plasmids based on the exlA expression signals
3.1.1 Vector

Plasmid pAW14A-Not was the starting vector for construction of a series of expression plasmids containing the exlA expression signals and the gene coding for Thermomyces lanuginosa lipase. Plasmid pAW14A comprises an Aspergillus niger var. awamori chromosomal 5 kb SalI fragment on which the 0.7 kb exlA gene is located, together with 2.5 kb of 5'-flanking sequences and 2.0 kb of 3'-flanking sequences (similar to pAW14B, see Fig. 3). In pAW14A the EcoRI site originating from the pUC19 polylinker was converted to a NotI site by insertion of a synthetic oligonucleotide (Eco-Not, see Table E), yielding pAW14A-Not.

Starting from pAW14A-Not, constructs were made in which the exlA promoter (2.5 kb) was fused to the translation-initiation codon (ATG) of the Thermomyces lanuginosa lipase gene. Also, constructs were made in which the exlA promoter and the DNA sequence coding for the first 27 amino acids of the exlA protein, which is the preprosequence, were fused to the sequences coding for the mature lipase polypeptide.

In both series of expression vectors the exlA transcription terminator was used.

The following vector fragments were isolated from pAW14A-Not and used for the constructions:

* for the fusion with the translation-initiation codon of the lipase a 7.2 kb BspHI-AflII fragment was isolated from pAW14A-Not. This fragment is similar to the one isolated for the construction of pAW15-1 (see Example 2) and was isolated essentially by the same approach as described in Example 2. The fragment contains 2.5 kb nucleotide sequences comprising the exlA promoter up to the BspHI site which comprises the ATG codon, and 2.0 kb nucleotide sequences comprising the exlA transcription terminator starting with the AflII site which comprises the stopcodon.

* for the fusion of the exlA promoter and exlA pre-pro-peptide encoding region (the first 27 amino acids of the exlA gene) with the coding region of the mature lipase polypeptide, a partially digested 7.2 kb NruI-AflII fragment was isolated from pAW14A-Not. This fragment contains 2.5 kb nucleotide sequences comprising the exlA promoter and the coding sequence of the first 26 amino acids of the exlA protein (preprosequence), which ends with a NruI site, thus lacking only 1 amino acid of the exlA prosequence. Furthermore the fragment comprises 2 kb nucleotide sequences comprising the exlA transcription terminator starting with the AflII site.

The Thermomyces lanuginosa lipase gene was isolated from vector pTL-1, which comprises a 0.9 kb coding region of the lipase gene (Figure 9) flanked by the Aspergillus niger glaA promoter and the Aspergillus nidulans trpC transcription terminator (Figure 8).

3.1.2 Fusion of the lipase gene with the exlA transcription terminator sequence

To obtain a fusion of the exlA transcription terminator with the lipase gene, an AflII site was created just downstream of the stopcodon of the lipase gene. In pTL-1 a HindIII site is present 5 base pairs downstream of the stopcodon of the lipase gene (Figures 8 and 9), in which an AflII site was created using a synthetic oligonucleotide. The construction was carried out as follows:

pTL-1 was cut with HindIII, yielding a linear 8.3 kb fragment, which was isolated from an agarose gel and ligated with the oligonucleotide Hind-Afl (see Table E). In the resulting vector, pTL1-AflII, the HindIII site has disappeared and an AflII site has been created just downstream of the stopcodon of the lipase gene, thus preparing the lipase gene for fusion to the exlA terminator at the AflII site.

3.1.3 Fusion of the exlA promoter with the lipase gene (ATG fusion)

pTL1-AflII was used as starting vector to isolate a DNA fragment comprising the lipase gene. To fuse the lipase gene to the exlA promoter the region of the lipase gene comprising the ATG codon (ATATGA) was converted to a BspHI site (TCATGA). This site still comprises the correct coding sequence of the lipase gene. For this purpose a synthetic DNA fragment was used, consisting of oligonucleotides BTFF09 and BTFF10 (see Table E) annealed to each other. This synthetic fragment contains a XhoI site for cloning, followed by a BspHI site comprising the ATG codon and the next 7 base pairs of the lipase gene up to the SacI site.

Vector pTL1-AflII was linearized by partial digestion with XhoI, followed by cutting with SacI, which cuts after position +10 of the open reading frame encoding the lipase pre-pro-polypeptide. The 6.3 kb XhoI-SacI vector fragment (resulting from a cut at the XhoI site in the glaA promoter while leaving the internal XhoI site in the lipase gene intact, see Figure 8) was isolated from an agarose gel and ligated with the synthetic XhoI-SacI fragment, resulting in vector pTL1-XS. From pTL1-XS a 0.9 kb BspHI-AflII fragment comprising the lipase gene was isolated and ligated to the 7.2 kb BspHI-AflII fragment from pAW14A-Not, yielding expression vector pAWTL-1 (Fig. 6).

3.1.4 Fusion of the exlA promoter and the region encoding the exlA prepro sequence with the coding sequence of the lipase mature protein

pTL1-AflII was used as starting vector to isolate a DNA fragment comprising the lipase gene. To obtain a correct fusion of the sequence encoding the lipase mature polypeptide with the exlA promoter sequence and the exlA leader peptide encoding sequences, a synthetic DNA fragment was used, consisting of oligonucleotides BTFF05 and BTFF06 (see Table E) annealed to each other. This synthetic fragment comprises sequences encoding the last amino acid of the exlA pre-pro-sequence fused to the first 12 codons of the mature lipase encoding sequence. It contains a XhoI site for cloning and a NruI site, which comprises the last 3 base pairs of the exlA prosequence. The fragment ends with a BglII site. Vector pTL1-AflII was linearized by partial digestion with XhoI, followed by cutting with BglII, which cuts just within the region coding for the mature lipase. The 6.3 kb XhoI-BglII vector fragment (resulting from a cut at the XhoI site in the glaA promoter while leaving the internal XhoI site in the lipase gene intact, see Figure 8) was isolated from an agarose gel and ligated with the synthetic XhoI-BglII fragment, resulting in pTL1-XB. From pTL1-XB a 0.83 kb NruI-AflII fragment was isolated containing the last 3 base pairs of the exlA prosequence followed by the sequence encoding the mature lipase polypeptide up to the AflII site just beyond the stop codon (see Example 2). This fragment was ligated with the 7.2 kb NruI-AflII fragment of pAW14A-Not to give expression vector pAWTL2 (Fig. 7).
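
The reading frame across the NruI junction can be checked by translating the joined sequence. The sketch below is illustrative only and assumes that NruI cuts blunt in the middle of its site (TCG^CGA, standard enzyme data). The upstream bases are taken from the exlA sequence in SEQ ID NO:1 (...GTG TCG CGA AGT..., the end of the prosequence) and the lipase bases from SEQ ID NO:3 (start of the mature peptide); the names `EXLA_UP_TO_NRUI`, `LAST_PRO_CODON`, `LIPASE_MATURE_START` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring an incomplete final codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# End of the vector fragment: exlA codons for Val-Ser, cut after TCG by NruI.
EXLA_UP_TO_NRUI = "GTGTCG"
# Start of the insert: last prosequence codon (Arg) restored by the synthetic fragment.
LAST_PRO_CODON = "CGA"
# First codons of the mature lipase (Glu-Val-Ser-Gln-Asp-Leu) from SEQ ID NO:3.
LIPASE_MATURE_START = "GAGGTCTCGCAAGATCTG"

junction = EXLA_UP_TO_NRUI + LAST_PRO_CODON + LIPASE_MATURE_START

print("junction DNA:    ", junction)
print("junction protein:", translate(junction))
print("NruI site regenerated:", "TCGCGA" in junction)
```

The translation reads Val-Ser-Arg followed directly by Glu-Val-Ser-Gln-Asp-Leu, so the exlA leader ends at its Arg residue and the mature lipase follows in frame, as described in the text.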
3.2 Production and secretion of Thermomyces lanuginosa lipase using the exlA promoter and terminator.

The constructed expression vectors (pAWTL1 and pAWTL2) can subsequently be transferred to moulds (for example Aspergillus niger, Aspergillus niger var. awamori, Aspergillus nidulans etc.) by means of conventional co-transformation techniques and the lipase can then be expressed via induction of the endoxylanase II promoter. The constructed vector can also be provided with conventional selection markers (e.g. amdS or pyrG, hygromycin etc.) and the mould can be transformed with the resulting vector to produce the desired protein, essentially as described in Example 2. As an example, plasmids were derived from pAWTL1 and pAWTL2 by introduction of an Aspergillus niger var. awamori pyrG gene, and the resulting plasmids were introduced in strain AWPYR, an Aspergillus niger var. awamori strain derived from strain CBS 115.52 (ATCC 11358) in which the pyrG gene has been disrupted. Following this route, transformant AWLPL1-2 was derived using the pAWTL1 plasmid, whereas transformant AWLPL2-2 was derived starting from the pAWTL2 plasmid.

Transformant AWLPL1-2 (containing the Thermomyces lanuginosa mature lipase encoding region with the Thermomyces lanuginosa signal sequence under the control of the Aspergillus niger var. awamori exlA promoter and terminator) and transformant AWLPL2-2 (containing the Thermomyces lanuginosa mature lipase encoding region with the endoxylanase signal sequence under the control of the Aspergillus niger var. awamori exlA promoter and terminator) were grown in shake flasks on AW Medium as described in Example 2. Incubation took place at 30°C, 200 rpm for 24 hours in a Mk X incubator shaker. After growth cells were collected by filtration (0.45 μm filter), washed twice with AW Medium without sucrose and yeast extract (salt solution), resuspended in 50 ml salt solution and transferred to 300 ml shake flasks containing 50 ml salt solution to which xylose has been added to a final concentration of 10 g/l (induction medium). The moment of resuspension is referred to as "t=0" (start of induction). Incubation took place under the same conditions as described above. Samples were taken 15, 22 and 39 hours after induction. Samples were filtered over miracloth to remove biomass and the filtrate was analyzed for lipase activity by a titrimetric assay using olive oil as a substrate.

For each sample between 100 and 200 μl of filtrate was added to a stirred mixture of 5.0 ml lipase substrate (Sigma, containing olive oil as a substrate for the lipase) and 25.0 ml of buffer (5 mM Tris-HCl pH 9.0, 40 mM NaCl, 20 mM CaCl2). The assay was carried out at 30°C and the release of fatty acids was measured by automated titration with 0.05 M NaOH to pH 9.0 using a Mettler DL25 titrator. A curve of the amount of titrant against time was obtained. The amount of lipase activity contained in the sample was calculated from the maximum slope of this curve. One unit of enzymatic activity is defined as the amount of enzyme that releases 1 μmol of fatty acid from olive oil in one minute under the conditions specified above. Such determinations are known to those skilled in the art.
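
The conversion from titration curve to enzyme units can be sketched as follows. This is an illustration under stated assumptions, not the titrator's own algorithm: it assumes that one mole of NaOH neutralizes one mole of released fatty acid, takes the maximum slope between consecutive readings, and uses made-up example readings. The names `max_slope`, `lipase_units` and the data are hypothetical.

```python
# 0.05 M NaOH corresponds to 0.05 micromol of base per microlitre of titrant.
NAOH_UMOL_PER_UL = 0.05


def max_slope(times_min, titrant_ul):
    """Largest rate of titrant addition (microlitre per minute) between consecutive readings."""
    points = list(zip(times_min, titrant_ul))
    return max(
        (v1 - v0) / (t1 - t0)
        for (t0, v0), (t1, v1) in zip(points, points[1:])
    )


def lipase_units(slope_ul_per_min: float) -> float:
    """Units = micromol fatty acid released per minute (assumes 1:1 neutralization)."""
    return slope_ul_per_min * NAOH_UMOL_PER_UL


# Hypothetical readings for one assay (not data from the document).
times_min = [0, 1, 2, 3, 4]
titrant_ul = [0, 20, 60, 100, 120]
sample_ml = 0.2  # 200 microlitre of culture filtrate

slope = max_slope(times_min, titrant_ul)
units = lipase_units(slope)
units_per_ml = units / sample_ml

print("maximum slope (ul/min):", slope)
print("activity in assay (U):", units)
print("activity in filtrate (U/ml):", units_per_ml)
```

Dividing by the sample volume makes assays run with 100 μl and 200 μl of filtrate comparable.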

The results are presented in Table D. From these results it is evident that functional lipase is produced and secreted after induction of the exlA promoter by xylose, and that the exlA signal sequence can direct the secretion of heterologous proteins in Aspergillus niger var. awamori.



Strain      exp   t = 0   t = 15   t = 22   t = 39
AWLPL1-2    A     3.2     76       65       59
AWLPL1-2    B     7       84       32       35
AWLPL2-2    A     9       77       50       49
AWLPL2-2    B     8       72       51       46
AW          A     7       9        8        8
AW          B     6       7        7        8

Table D. Production and secretion of lipase

Transformants were grown on synthetic medium as indicated in the text for 24 hours and at t = 0 were transferred to induction medium as indicated in the text. Lipase activity in the medium was determined by a titrimetric assay using olive oil as substrate and is expressed in arbitrary units of lipase activity. A and B represent duplicate experiments.

BTFF05     5'-TCGAGTCGCGAGAGGTCTCGCAA-3'     sequence listing 5
BTFF06     5'-GATCTTGCGAGACCTCTCGCGAC-3'     sequence listing 6
BTFF09     5'-TCGAGCGTCATGAGGAGCT-3'         sequence listing 7
BTFF10     5'-CCTCATGACGC-3'                 sequence listing 8
Eco-Not    5'-AATTGCGGCCGC-3'                sequence listing 9
Hind-Afl   5'-AGCTCGCTTAAGCG-3'              sequence listing 10
Nco-Afl    5'-CATGCCTTAAGG-3'                sequence listing 11
Xyl11      5'-GCATATGATTAAGCTGC-3'           sequence listing 12

Table E. Nucleotide sequences of oligonucleotides
Sequence listing numbers refer to the listings provided in the official format.
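
The design of the Table E oligonucleotides can be checked computationally. The sketch below is illustrative and rests on assumptions: the sequences are as read from Table E (the Nco-Afl entry is poorly legible in the source and is read here as CATGCCTTAAGG), the three linkers are assumed to anneal to themselves, and the restriction sites named in the comments are standard enzyme data. The helper names `revcomp`, `self_annealing` and `paired_region` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

OLIGOS = {
    "BTFF05": "TCGAGTCGCGAGAGGTCTCGCAA",
    "BTFF06": "GATCTTGCGAGACCTCTCGCGAC",
    "BTFF09": "TCGAGCGTCATGAGGAGCT",
    "BTFF10": "CCTCATGACGC",
    "Eco-Not": "AATTGCGGCCGC",
    "Hind-Afl": "AGCTCGCTTAAGCG",
    "Nco-Afl": "CATGCCTTAAGG",  # assumed reading of a garbled table entry
}


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def self_annealing(oligo: str, overhang: int = 4) -> bool:
    """True if everything after a 4-nt 5' overhang is palindromic (linker pairs with itself)."""
    core = oligo[overhang:]
    return core == revcomp(core)


def paired_region(top: str, bottom: str) -> str:
    """Longest stretch where the top strand matches the reverse complement of the bottom strand."""
    rc = revcomp(bottom)
    best = ""
    for shift in range(-len(rc) + 1, len(top)):
        start_top, start_rc = max(shift, 0), max(-shift, 0)
        n = min(len(top) - start_top, len(rc) - start_rc)
        a, b = top[start_top:start_top + n], rc[start_rc:start_rc + n]
        if a == b and n > len(best):
            best = a
    return best


linkers = {name: self_annealing(OLIGOS[name]) for name in ("Eco-Not", "Hind-Afl", "Nco-Afl")}
duplex_05_06 = paired_region(OLIGOS["BTFF05"], OLIGOS["BTFF06"])
duplex_09_10 = paired_region(OLIGOS["BTFF09"], OLIGOS["BTFF10"])

print("self-annealing linkers:", linkers)
print("BTFF05/06 duplex:", duplex_05_06, "| NruI site:", "TCGCGA" in duplex_05_06)
print("BTFF09/10 duplex:", duplex_09_10, "| BspHI site:", "TCATGA" in duplex_09_10)
# Ligation into a cut site changes the base after the overhang, so the old site is lost.
print("EcoRI site lost:", "G" + OLIGOS["Eco-Not"][:5] != "GAATTC")
print("HindIII site lost:", "A" + OLIGOS["Hind-Afl"][:5] != "AAGCTT")
```

Under these assumptions each linker carries the overhang of the site it is inserted into plus the new site (NotI or AflII), and each annealed pair leaves a 5'-TCGA end compatible with XhoI, which agrees with the cloning steps described in Examples 2 and 3.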
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SEQUENCE LISTING 



(1) GENERAL INFORMATION:

  (i) APPLICANT:
      (A) NAME: Unilever N.V.
      (B) STREET: Weena 455
      (C) CITY: Rotterdam
      (E) COUNTRY: The Netherlands
      (F) POSTAL CODE (ZIP): NL-3013 AL

      (A) NAME: Unilever PLC
      (B) STREET: Unilever House Blackfriars
      (C) CITY: London
      (E) COUNTRY: United Kingdom
      (F) POSTAL CODE (ZIP): EC4P 4BQ

  (ii) TITLE OF INVENTION: Process for the production of a protein using endoxylanase II (exlA) expression signals

  (iii) NUMBER OF SEQUENCES: 12

  (iv) COMPUTER READABLE FORM:
      (A) MEDIUM TYPE: Floppy disk
      (B) COMPUTER: IBM PC compatible
      (C) OPERATING SYSTEM: PC-DOS/MS-DOS
      (D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

  (v) CURRENT APPLICATION DATA:
      APPLICATION NUMBER:

(2) INFORMATION FOR SEQ ID NO:1:

  (i) SEQUENCE CHARACTERISTICS:
      (A) LENGTH: 2059 base pairs
      (B) TYPE: nucleic acid
      (C) STRANDEDNESS: double
      (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: DNA (genomic)

  (iii) HYPOTHETICAL: NO

  (iv) ANTI-SENSE: NO

  (vi) ORIGINAL SOURCE:
      (A) ORGANISM: Aspergillus niger var. awamori
      (B) STRAIN: CBS 115.52 (ATCC 11358)

  (vii) IMMEDIATE SOURCE:
      (B) CLONE: pAW14B

  (ix) FEATURE:
      (A) NAME/KEY: intron
      (B) LOCATION: 581..629
      (C) IDENTIFICATION METHOD: experimental
      (D) OTHER INFORMATION: /evidence= EXPERIMENTAL

  (ix) FEATURE:
      (A) NAME/KEY: promoter
      (B) LOCATION: 1..350
      (C) IDENTIFICATION METHOD: experimental
      (D) OTHER INFORMATION: /evidence= EXPERIMENTAL

  (ix) FEATURE:
      (A) NAME/KEY: sig_peptide
      (B) LOCATION: 351..431

  (ix) FEATURE:
      (A) NAME/KEY: mat_peptide
      (B) LOCATION: join(432..580, 630..1032)
      (C) IDENTIFICATION METHOD: experimental
      (D) OTHER INFORMATION: /EC_number= 3.2.1.8
          /product= "endoxylanase II"
          /evidence= EXPERIMENTAL
          /gene= "exlA"

  (ix) FEATURE:
      (A) NAME/KEY: CDS
      (B) LOCATION: join(351..580, 630..1035)
      (C) IDENTIFICATION METHOD: experimental
      (D) OTHER INFORMATION: /EC_number= 3.2.1.8
          /product= "pre-pro endoxylanase II"
          /evidence= EXPERIMENTAL
          /gene= "exlA"

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCCCTTTTA TCCGTCTGCC GTCCATTTAG CCAAATGTAG TCCATTTAGC CAAGTGCGGT       60
CATTTAGCC AAGACCAGTG GCTAGATTGG TGGCTACACA GCAAACGCAT GACTGAGACA      120
AACTATAGG ACTGTCTCTG GAAATAGGCT CGAGGTTGTT CAACCGTTTA AGGTGATGCG      180
CAAAATGCA TATGACTAAG CTGCTTCATC TTGCAGGGGG AAGGGATAAA TAGTCTTTTT      240
GCAGAATAT AAATAGAGGT AGAGTGGGCT CGCAGCAATA TTGACCAGCA CAGTGCTTCT      300
TTCCAGTTG CATAAATCCA TTCACCAGCA TTTAGCTTTC TTCAATCATC ATG AAG         356
                                                      Met Lys
                                                      -27

GTC ACT GCG GCT TTT GCA GGT CTT TTG GTC ACG GCA TTC GCC GCT CCT       404
Val Thr Ala Ala Phe Ala Gly Leu Leu Val Thr Ala Phe Ala Ala Pro
-25                 -20                 -15                 -10

GTG CCG GAA CCT GTT CTG GTG TCG CGA AGT GCT GGT ATT AAC TAC GTG       452
Val Pro Glu Pro Val Leu Val Ser Arg Ser Ala Gly Ile Asn Tyr Val

CAA AAC TAC AAC GGC AAC CTT GGT GAT TTC ACC TAT GAC GAG AGT GCC       500
Gln Asn Tyr Asn Gly Asn Leu Gly Asp Phe Thr Tyr Asp Glu Ser Ala

Tyr Gly Asp Tyr Asn Pro Cys Ser Ser Ala Thr Ser Leu Gly Thr Val

Tyr Ser Asp Gly Ser Thr Tyr Gln Val Cys Thr Asp Thr Arg Thr Asn
            105                 110

Glu Pro Ser Ile Thr Gly Thr Ser Thr Phe Thr Gln Tyr Phe Ser Val
        120                 125

Arg Glu Ser Thr Arg Thr Ser Gly Thr Val Thr Val Ala Asn His Phe
    135                 140                 145

Asn Phe Trp Ala Gln His Gly Phe Gly Asn Ser Asp Phe Asn Tyr Gln
150                 155                 160

Val Met Ala Val Glu Ala Trp Ser Gly Ala Gly Ser Ala Ser Val Thr
                170                 175

Ile Ser Ser

TCCAGATATT CTATACTAAC AGACTTCTAA TGACTGCGGA TAATATAGAG GGCAAGAATT    1442
TCTACAGTTC GACGCAGTTC AACGCAATCA GAGAGGGAAT ACTGATGAGA GTGCAATCAG    1502
TTAGAGAAGG ACAACATGGC AGTCTTAGTG TGAACTTACA TAACGATATG GACTCTAGAA    1562
AAAAGGAAGG AGCTCCGTCT ATATATAGCG CCATTACGTG TATCTGATGC TTGCCCATTG    1622
CCACTGGGTA GGGTGACTTT TTGAAGCGAC TCGACATATA ATATGACAAA CTCATGCCCC    1682
CTTTGCAGGA AACTTAGCTT TTCCTGCCTT GCTTTGAAGC CACAATTATC ACGAAACTCA    1742
TTTAGAGATT TATCTTCCTG TAACGGAAAC AAATATTTCG GGATTGGAAT AGCCTTTTGC    1802
CGAACTCATT ATTTTTTTGC GACGGTAAAT CTGGGAGTAT ACGATGTCCT TTCACGTTTC    1862
TCAACAAAAC TCTGCCGCAC CGGGTAACCT ACGGATAGTA CTGTATCCAG ACTCAGTTTT    1922
TCTAATAACA GGACACTGTG CAATTTGCGG GAAAATTCCT ATGTATATTA CTTTCTCGTT    1982
GCATCTCAAA TATTGTGGCT TTTTGAGACC CACACTATGT CTTGCACATA TTGTACCATC    2042
CTTGCTTGAG GCCAATT                                                   2059

(2) INFORMATION FOR SEQ ID NO:2:

  (i) SEQUENCE CHARACTERISTICS:
      (A) LENGTH: 211 amino acids
      (B) TYPE: amino acid
      (D) TOPOLOGY: linear

  (ii) MOLECULE TYPE: protein

  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Lys Val Thr Ala Ala Phe Ala Gly Leu Leu Val Thr Ala Phe Ala
-27     -25                 -20                 -15

Ala Pro Val Pro Glu Pro Val Leu Val Ser Arg Ser Ala Gly Ile Asn
    -10                 -5                  1               5

Tyr Val Gln Asn Tyr Asn Gly Asn Leu Gly Asp Phe Thr Tyr Asp Glu
                10                  15                  20

Ser Ala Gly Thr Phe Ser Met Tyr Trp Glu Asp Gly Val Ser Ser Asp
            25                  30                  35

Phe Val Val Gly Leu Gly Trp Thr Thr Gly Ser Ser Asn Ala Ile Thr
        40                  45                  50

Tyr Ser Ala Glu Tyr Ser Ala Ser Gly Ser Ser Ser Tyr Leu Ala Val
    55                  60                  65

Tyr Gly Trp Val Asn Tyr Pro Gln Ala Glu Tyr Tyr Ile Val Glu Asp
70                  75                  80                  85
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(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 886 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(ix) FEATURE : 

(A) NAME/KEY: CDS 

(B) LOCATION: 1..876 

(C) IDENTIFICATION METHOD: experimental 

(D) OTHER INFORMATION: /product- "Thermomyces lanuginosa 

pre-pro lipase" 
/evidence= EXPERIMENTAL 



(ix) FEATURE: 

(A) NAME/KEY: raat_peptide 

(B) LOCATION: 67. .873 

(C) IDENTIFICATION METHOD: experimental 

(D) OTHER INFORMATION: /product- "Thermomyces lanuginosa 

"lipase" 

/evidence- EXPERIMENTAL 



(xi) SEQUENCE DESCRIPTION: SEQ ID N0:3: 

ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG 48 

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu 
-22 -20 -15 " 10 

GCC AGT CCT ATT CGT CGA GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC 96 

Ala Ser Pro He Arg Arg Glu Val Ser Gin Asp Leu Phe Asn Gin Phe 



-5 1 



AAT CTC TTT GCA CAG TAT TCT GCT GCC GCA TAC TGC GGA AAA AAC AAT 
Asn Leu Phe Ala Gin Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn 

20 25 



144 



15 



GAT GCC CCA GCT GGT ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC 192 
Asp Ala Pro Ala Gly Thr Asn He Thr Cys Thr Gly Asn Ala Cys Pro 
30 35 *0 

GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT 240 
Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser 
45 \ 50 55 

GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTA GAC AAC ACG AAC AAA 288 
Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys 
60 65 70 

TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAA AAC TGG ATC 336 
Leu He Val Leu Ser Phe Arg Gly Ser Arg Ser He Glu Asn Trp lie 
75 80 85 90 
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GGA AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC 384 
Gly Asn Leu Asn Phe Asp Leu Lys Glu lie Asn Asp lie Cys Ser Gly 
95 100 105 

TGC AGG GGA CAT GAC GGC TTC ACC TCG AGC TGG AGG TCT GTA GCC GAT 4 32 

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp 
110 115 120 

ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT 480 
Thr Leu Arg Gin Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr 
125 130 135 

CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT 528 
Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val 
140 145 150 

GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAC ATC GAC GTG TTT TCA 576 
Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
155 160 165 170 

TAT GGC GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC 624 
Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr 
175 180 185 

GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT 672 
Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
190 195 200 

GTC CCT AGA CTC CCG CCG CGC GAG TTC GGT TAC AGC CAT TCT AGC CCA 720 
Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro 
205 210 215 

GAG TAC TGG ATC AAA TCT GGA ACC CTT GTC CCC GTC ACC CGA AAC GAC 768
Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
220 225 230 

ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT 816 
Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
235 240 245 250 

AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG 864 
Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
255 260 265 

ACA TGT CTT TAGTGCGAAG CTT 886 
Thr Cys Leu 

#270 
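
The feature table for SEQ ID NO:3 can be cross-checked with a few lines of Python. The sketch below is an illustration only and is not part of the original listing: it translates the first 96 nucleotides with the standard genetic code and derives the residue counts implied by the mature-peptide location 67..873. The names `CODON_TABLE` and `translate` are hypothetical helpers introduced here.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str) -> str:
    """Translate a DNA string (spaces ignored) codon by codon."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))

# First two rows of SEQ ID NO:3 (nucleotides 1-96), copied from the listing.
first_96 = (
    "ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG "
    "GCC AGT CCT ATT CGT CGA GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC"
)
protein = translate(first_96)

MAT_START, MAT_END = 67, 873          # mature-peptide location from the feature table
pre_pro_len = (MAT_START - 1) // 3    # residues numbered -22 .. -1
mature_len = (MAT_END - MAT_START + 1) // 3
total_len = pre_pro_len + mature_len  # compare with the LENGTH given for SEQ ID NO:4

print(protein)
print(pre_pro_len, mature_len, total_len, protein[pre_pro_len])
```

Under these assumptions the pre-pro region spans 22 residues, the mature lipase 269 residues, and the total of 291 agrees with the length stated for SEQ ID NO: 4; residue 1 of the mature peptide is the Glu encoded at nucleotides 67-69.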



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 291 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Arg Ser Ser Leu Val Leu Phe Phe Val Ser Ala Trp Thr Ala Leu
-22 -20 -15 -10

Ala Ser Pro Ile Arg Arg Glu Val Ser Gln Asp Leu Phe Asn Gln Phe
-5 1

Asn Leu Phe Ala Gln Tyr Ser Ala Ala Ala Tyr Cys Gly Lys Asn Asn
15 20 25

Asp Ala Pro Ala Gly Thr Asn Ile Thr Cys Thr Gly Asn Ala Cys Pro
30 35 40

Glu Val Glu Lys Ala Asp Ala Thr Phe Leu Tyr Ser Phe Glu Asp Ser
45 50 55

Gly Val Gly Asp Val Thr Gly Phe Leu Ala Leu Asp Asn Thr Asn Lys
60 65 70

Leu Ile Val Leu Ser Phe Arg Gly Ser Arg Ser Ile Glu Asn Trp Ile
75 80 85 90

Gly Asn Leu Asn Phe Asp Leu Lys Glu Ile Asn Asp Ile Cys Ser Gly
95 100 105

Cys Arg Gly His Asp Gly Phe Thr Ser Ser Trp Arg Ser Val Ala Asp
110 115 120

Thr Leu Arg Gln Lys Val Glu Asp Ala Val Arg Glu His Pro Asp Tyr
125 130 135

Arg Val Val Phe Thr Gly His Ser Leu Gly Gly Ala Leu Ala Thr Val
140 145 150

Ala Gly Ala Asp Leu Arg Gly Asn Gly Tyr Asp Ile Asp Val Phe Ser
155 160 165 170

Tyr Gly Ala Pro Arg Val Gly Asn Arg Ala Phe Ala Glu Phe Leu Thr
175 180 185

Val Gln Thr Gly Gly Thr Leu Tyr Arg Ile Thr His Thr Asn Asp Ile
190 195 200

Val Pro Arg Leu Pro Pro Arg Glu Phe Gly Tyr Ser His Ser Ser Pro
205 210 215

Glu Tyr Trp Ile Lys Ser Gly Thr Leu Val Pro Val Thr Arg Asn Asp
220 225 230

Ile Val Lys Ile Glu Gly Ile Asp Ala Thr Gly Gly Asn Asn Gln Pro
235 240 245 250

Asn Ile Pro Asp Ile Pro Ala His Leu Trp Tyr Phe Gly Leu Ile Gly
255 260 265

Thr Cys Leu
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(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TCGAGTCGCG AGAGGTCTCG CAA 23 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
GATCTTGCGA GACCTCTCGC GAC 23 
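
SEQ ID NO: 5 and SEQ ID NO: 6 are largely complementary to each other, which can be checked mechanically. The sketch below is an illustration under stated assumptions, not a statement of how the oligonucleotides were used: it assumes the two strands are annealed antiparallel, and `reverse_complement` is a hypothetical helper introduced here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

seq5 = "TCGAGTCGCGAGAGGTCTCGCAA"   # SEQ ID NO: 5, written 5'->3'
seq6 = "GATCTTGCGAGACCTCTCGCGAC"   # SEQ ID NO: 6, written 5'->3'

rc6 = reverse_complement(seq6)

# Longest suffix of seq5 that equals a prefix of rc6: this is the
# double-stranded core if the strands anneal antiparallel (assumption).
overlap = max(n for n in range(len(seq5) + 1) if seq5.endswith(rc6[:n]))
core = seq5[len(seq5) - overlap:]
overhang5 = seq5[:len(seq5) - overlap]   # unpaired 5' end of SEQ ID NO: 5
overhang6 = seq6[:len(seq6) - overlap]   # unpaired 5' end of SEQ ID NO: 6

print(overlap, core, overhang5, overhang6)
```

Under that assumption the duplex has a 19 bp core flanked by the 5' overhangs TCGA and GATC. These match the overhangs left by enzymes such as XhoI and BglII respectively, although the listing itself does not state which sites were intended.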
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TCGAGCGTCA TGAGGAGCT 19
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CLAIMS 



1. Process for the production and optionally secretion of a protein by means
of a transformed mould, into which an expression vector has been introduced with the
aid of recombinant DNA techniques known per se, said vector comprising one or
more mould-derived expression and/or secretion regulating regions,
in which process at least one of said expression and/or secretion regulating regions is
selected from

(1) the expression and secretion regulating regions of the endoxylanase II gene (exlA
gene) of Aspergillus niger var. awamori present on plasmid pAW14B (Figure 3),
which is present in a transformed E. coli strain JM109 deposited at the
Centraalbureau voor Schimmelcultures in Baarn, The Netherlands, under N° CBS 237.90
on 31 May 1990, and

(2) functional derivatives thereof also having expression and/or secretion regulating
activity,

with the proviso that said protein is different from the endoxylanase type II ex
Aspergillus niger var. awamori.

2. Process according to Claim 1, in which process the selected expression
regulating region is a promoter and said vector comprises a gene encoding said
protein under control of said promoter, the latter being selected from

(1) the promoter of the endoxylanase II gene (exlA gene) of Aspergillus niger var.
awamori present on plasmid pAW14B (Figure 3), which is present in a
transformed E. coli strain JM109 deposited at the Centraalbureau voor
Schimmelcultures in Baarn, The Netherlands, under N° CBS 237.90 on 31 May
1990, and

(2) functional derivatives thereof also having promoter activity. 

3. Process according to Claim 2, in which said promoter is equal to the
promoter present on the 5' part upstream of the exlA gene having a size of about 2.5
kb located between the MI restriction site at position 0 and the start codon ATG of
the exlA gene in plasmid pAW14B.
4. Process according to Claim 2, in which said promoter comprises at least
the polynucleotide sequence 1-350 according to Figure 1. 

5. Process according to Claim 2, in which said promoter is induced by wheat 
bran, xylan, or xylose, or a mixture of any combination thereof, present in a medium 
in which the transformed mould is incubated. 

6. Process according to Claim 5, in which said promoter is induced by xylose
present in the medium in which the transformed mould is incubated. 

7. Process according to Claim 1, in which process the selected expression
regulating region is a terminator and said vector comprises a gene encoding said
protein followed by said terminator, the latter being selected from

(1) the terminator of the endoxylanase II gene (exlA gene) of Aspergillus niger var.
awamori present on plasmid pAW14B (Figure 3), which is present in a
transformed E. coli strain JM109 deposited at the Centraalbureau voor
Schimmelcultures in Baarn, The Netherlands, under N° CBS 237.90 on 31 May
1990, and 

(2) functional derivatives thereof also having terminator activity. 

8. Process according to Claim 7, in which said terminator is equal to the 
terminator present on the 3' part downstream of the exlA gene having a size of about 
1.0 kb located right downstream of the stop codon (TAA) of the exlA gene in plasmid 
pAW14B.

9. Process according to Claim 7, in which said vector also comprises a 
promoter as claimed in Claim 2. 
10. Process according to Claim 1 for the production and secretion of a protein
by means of a transformed mould, in which process the selected secretion regulating
region is a DNA sequence encoding a signal sequence and said vector comprises a
gene encoding said protein preceded by said DNA sequence encoding a signal
sequence, the latter being selected from

(1) the DNA sequence encoding the signal sequence of the endoxylanase II gene
(exlA gene) of Aspergillus niger var. awamori present on plasmid pAW14B (Figure
3), which is present in a transformed E. coli strain JM109 deposited at the
Centraalbureau voor Schimmelcultures in Baarn, The Netherlands, under N° CBS
237.90 on 31 May 1990, and

(2) functional derivatives thereof also directing secretion of the protein. 

11. Process according to Claim 10, in which the gene (1) encoding said protein 
is also preceded by at least an essential part of a DNA sequence (2) encoding the 
mature endoxylanase II protein, whereby said DNA sequence (2) is present between 
said DNA sequence encoding a signal sequence (3) and the gene (1). 

12. Process according to Claim 10, in which said signal sequence is the signal
sequence encoded by polynucleotide 351-431 of the DNA sequence given in Figure 1, 
which polynucleotide precedes the exlA gene in plasmid pAW14B. 

13. Process according to Claim 10, in which said vector also comprises a
promoter as claimed in Claim 2 or a terminator as claimed in Claim 7, or both. 
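
The coordinates given in Claim 12 can be checked against Fig. 1 with simple arithmetic. The sketch below is illustrative only: the one-letter sequence is transcribed from the translation lines printed under nucleotides 351 onward in Fig. 1 (reading the residue under the codon GGT as G), and all variable names are hypothetical.

```python
SIGNAL_START, SIGNAL_END = 351, 431   # polynucleotide range named in Claim 12

signal_nt = SIGNAL_END - SIGNAL_START + 1
assert signal_nt % 3 == 0, "range should cover whole codons"
signal_codons = signal_nt // 3

# One-letter translation transcribed from Fig. 1, starting at position 351.
fig1_translation = "MKVTAAFAGLLVTAFAAPVPEPVLVSRSAGINYVQNY"

signal_peptide = fig1_translation[:signal_codons]
mature_start_residue = fig1_translation[signal_codons]
mature_start_nt = SIGNAL_END + 1

print(signal_nt, signal_codons, signal_peptide, mature_start_residue, mature_start_nt)
```

On this reading the range 351-431 covers 81 nucleotides, that is 27 codons, so the signal sequence ends with ...Val-Ser-Arg and the mature xylanase begins at nucleotide 432, in line with the "mature xylanase" marker in Fig. 1.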
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Fig. 1. 



10 30 50 

AGcCCTTTTA TCCgTCTgcc gTCcATTTAG CCAAATGTAG TCCaTTTAGC CAACTCCCCT 

70 90 110 

CCATTTAGCC AAGACCACTG GCTAGATTGG TGGCTACACA GCAAACGCAT GACTCAGACA 

130 150 170 

CAACTATAGG actgtctctg caaataccct cgaggttgtt caagcgttta aggtgatgcg 

190 210 230 

GCAAAATGCA TATGAGTAAG CTGCTTcATC ttccagcccc aagcgataaa tactcttttt 

250 270 290 

cgcagaatat aaatagaggt agagtgggct cgcagcaata ttgaccagca cactgcttct 

310 330 350 

CTTCCAGTTC CATAAATCCA TTCACCAGCA TTTAGCTTTC TTCAATCATC ATG AAG GTC

M K V 

380 400 
ACT GCG GCT TTT GCA GGT CTT TTG CTC ACG GCA TTC GCC GCT CCT GTG CCC 
TAAFAGLLVTAFAAPVP

420 440 460 

GAA CCT GTT CTG GTG TCG CGA ACT GCT GGT ATT AAC TAC GTG CAA AAC TAC 
EPVLVSRSAGINYVQNY 

(-> mature xylanase

480 500 
AAC GGC AAC CTT GGT GAT TTC ACC TAT GAC GAG ACT GCC GGA ACA TTT TCC 
NGNLGDFTYDESAGTFS

520 540 560 

ATG TAC TGG GAA GAT CGA GTG AGC TCC GAC TTT CTC GTT GGT CTC GGC TCG 
MYWEDGVSSDFVVGLGW 

580 600 620 

ACC ACT GGT TCT TCT AA CTGACTCACT GTATTCTTTA ACCAAACTCT AGGATCTAAC 
T T G S S N 

640 660 
GTTTTCTAG C GCT ATC ACC TAC TCT GCC GAA TAC ACT GCT TCT GGC TCC TCT 
AITYSAEYSASGSS 

680 700 720 

tcc tac ctc gct gtg tac ggc tgg gtc aac tat cct cag gct gaa tac tac 
sylavygwvny pqa ey y 

740 760 
ATC GTC GAG GAT TAC GGT GAT TAC AAC CCT TCC AGC TCG GCC ACA AGC CTT 
IVEDYGDYNPCSSATSL

780 800 820 

GGT ACC GTG TAC TCT GAT GGA AGC ACC TAC CAA GTC TGC ACC GAC ACT CGA 
GTVYSDGSTYQVCTDTR

840 860 
ACT AAC CAA CCC TCC ATC ACG GGA ACA AGC ACC TTC ACC CAG TAC TTC TCC 
TNEPSITGTSTFTQYFS

880 900 920 

GTT CGA CAG AGC ACG CCC ACA TCT GGA ACC CTC ACT CTT GCC AAC CAT TTC 
VRESTRTSGTVTVANHF
Fig. 1 (cont.)



940 960 
AAC TTC TGG GCG CAG CAT GGG TTC GGA AAT AGC GAC TTC AAT TAT CAG GTC 
NFWAQHGFGNSDFNYQV

980 1000 1020 

ATG GCA GTG GAA CCA TGG AGC GGT GCT GGC AGC GCC AGT GTC ACG ATC TCC 
MAVEAWSGAGSASVTIS

1050 1070 1090

TCT TAA CGGAT AAGTGCCTTG GTAGTCGGAA GATGTCAACG CGGAACTTTG TTCTCACCTC

S *

1110 1130 1150 

GTGTGATGAT CGCATCCCCC CTCTGGTGGT TACATTGAGG CTGTATAACT TATTCTGGGG 

1170 1190 1210 

CCGAGCTGTC AGCGGCTGCG TTTCCAATTT GCACAGATAA TCAACTTTCg TTTTCTATCT 

1230 1250 1270 

CTTGCGTTTC CACGCTGTTT ATCCTATCCA TAGATAATAT cTTgCCCAAT* ACATATTATC 

1290 1310 1330 

TATATACAAC TTGTTCGGTC GCAGTAGTCA CTCCGAGCAA GGCATTGGGA AATTGGGAGA 



1350 1370 1390

TGCGGGCTGC TGCGTACGCT CTAAGCTAGG GCATTTAAAG GGATATTTAG CCTCCAGATA

1410 1430 1450

TTCTATACTA ACAGACTTCT AATGACTGCG GATAATATAG AGGGCAAGAA TTTCTACAGT

1470 1490 1510

TCGACGCAGT TCAACGCAAT CAGAGAGGGA ATACTGATGA GAGTGCAATC AGTTAGAGAA

1530 1550 1570

GGACAACATG GCACTCTTAG TCTGAACTTA CATAACGATA TGGACTCTAG AAAAAAGCAA



1590 1610 1630 

GGAGCTCCGT CTATATATAG CGCCATTACG TGTATCTGAT GCTTCCCCAT TGCCACTCGC 

1650 1670 1690 

TAGGGTGACT TTTTCAAGCC ACTCGACATA TAATATGACA AACTCATGCC CCCTTTGCAG 

1710 1730 1750 

GAAACTTAGC TTTTCCTGCC TTGCTTTGAA GCCACAATTA TCACGAAACT CATTTAGACA 

1770 1790 1810 

TTTATCTTCC TCTAACGGAA ACAAATATTT CGGGATTGGA ATAGCCTTTT GCCGAACTCA 

1830 1850 1870 

TTATTTTTTT GCGACGGTAA ATCTGGGAGT ATACGATCTC CTTTCACGTT TCTCAACAAA 

1890 1910 1930 

ACTCTGCCGC ACCGGCTAAC CTACGGATAG TACTGTATCC AGACTCAGTT TTTCTAATAA 

1950 1970 1990 

CAGCACACTG TGCAATTTGC GGGAAAATTC CTATGTATAT TACTTTCTCG TTGCATCTCA

2010 2030 2050 

AATATTCTCC CTTTTTGAGA CCCACACTAT GTCTTGCACA TATTGTACCA TCCTTGCTTC 

2059 
ACGCCAATT 
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Fig. 1 
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Fig. 5. 
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Fig. 8. 
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Fig. 9. 



1 ATG AGG AGC TCC CTT GTG CTG TTC TTT GTC TCT GCG TGG ACG GCC TTG
MRSSLVLFFVSAWTAL

49 GCC AGT CCT ATT CGT CGA GAG GTC TCG CAA GAT CTG TTT AAC CAG TTC 
ASPIRREVSQDLFNQF 

(-> mature lipase 

97 AAT CTC TTT GCA CAG TAT TCT GCT GCC GCA TAC TGC GGA AAA AAC AAT
NLFAQYSAAAYCGKNN

145 GAT GCC CCA GCT GGT ACA AAC ATT ACG TGC ACG GGA AAT GCC TGC CCC 
DAPAGTNITCTGNACP 

193 GAG GTA GAG AAG GCG GAT GCA ACG TTT CTC TAC TCG TTT GAA GAC TCT
EVEKADATFLYSFEDS

241 GGA GTG GGC GAT GTC ACC GGC TTC CTT GCT CTA GAC AAC ACG AAC AAA 
GVGDVTGFLALDNTNK 

289 TTG ATC GTC CTC TCT TTC CGT GGC TCT CGT TCC ATA GAA AAC TGG ATC 
LIVLSFRGSRSIENWI

337 GGA AAT CTT AAC TTC GAC TTG AAA GAA ATA AAT GAC ATT TGC TCC GGC
GNLNFDLKEINDICSG

385 TGC AGG GGA CAT GAC GGC TTC ACC TCG AGC TGG AGG TCT GTA GCC GAT
CRGHDGFTSSWRSVAD

433 ACG TTA AGG CAG AAG GTG GAG GAT GCT GTG AGG GAG CAT CCC GAC TAT
TLRQKVEDAVREHPDY

481 CGC GTG GTG TTT ACC GGA CAT AGC TTG GGT GGT GCA TTG GCA ACT GTT
RVVFTGHSLGGALATV

529 GCC GGA GCA GAC CTG CGT GGA AAT GGG TAT GAC ATC GAC GTG TTT TCA 
AGADLRGNGYDIDVFS 

577 TAT GGC GCC CCC CGA GTC GGA AAC AGG GCT TTT GCA GAA TTC CTG ACC 
YGAPRVGNRAFAEFLT

625 GTA CAG ACC GGC GGT ACC CTC TAC CGC ATT ACC CAC ACC AAT GAT ATT
VQTGGTLYRITHTNDI

673 GTC CCT AGA CTC CCG CCG CGC GAG TTC GGT TAC AGC CAT TCT AGC CCA
VPRLPPREFGYSHSSP

721 GAG TAC TGG ATC AAA TCT GGA ACC CTT GTC CCC GTC ACC CGA AAC GAC 
EYWIKSGTLVPVTRND 

769 ATC GTG AAG ATA GAA GGC ATC GAT GCC ACC GGC GGC AAT AAC CAG CCT 
IVKIEGIDATGGNNQP

817 AAC ATT CCG GAT ATC CCT GCG CAC CTA TGG TAC TTC GGG TTA ATT GGG
NIPDIPAHLWYFGLIG

865 ACA TGT CTT TAG TGCGAAGCTT 886 
T C L 
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